Human sequence of the non-coding RNA gene (including the
putative promoter)

1 CTTAGAGTTT CGTGGCTTCA GGGTGGGAGT AGTTGGAGCA TTGGGGATGT

51 TTTTCTTACC GACAAGCACA GTCAGGTTGA AGACCTAACC AGGGCCAGAA

101 GTAGCTTTGC ACTTTTCTAA ACTAGGCTCC TTCAACAAGG CTTGCTGCAG 

151 ATACTACTGA CCAGACAAGC TGTTGACCAG GCACCTCCCC TCCCGCCCAA 

201 ACCTTTCCCC CATGTGGTCG TTAGAGACAG AGCGACAGAG CAGTTGAGAG

251 GACACTCCCG TTTTCGGTGC CATCAGTGCC CCGTCTACAG CTCCCCCAGC

301 TCCCCCCACC TCCCCCACTC CCAACCACGT TGGGACAGGG AGGTGTGAGG
351 CAGGAGAGAC AGTTGGATTC TTTAGAGAAG ATGGATATGA CCAGTGGCTA
401 TGGGCTGTGC GATCCCACCC GTGGTGGCTC AAGTCTGGCC CCACACCAGC 
451 CCCAATCCAA AACTGGCAAG GACGCTTCAC AGGACAGGAA AGTGGCACCT 

501 GTCTGCTCCA GCTCTGGCAT GGCTAGGAGG GGGGAGTCCG TTGAACTACT

551 GGGTGTAGAC TGGCCTGAAC CACAGGAGAG GATGGCCCAG GGTGAGGTGG

601 CATGGTCCAT TCTCAAGGGA CGTCCTCCAA CGGGTGGCGC TAGAGGCCAT

651 GGAGGCAGTA GGACAAGGTG CAGGCAGGCT GGCCTGGGGT CAGGCGGGGC

701 AGAGCACAGC GGGGTGAGAG GGATTCCTAA TCACTCAGAG CAGTCTGTGA

751 CTTAGTGGAC AGGGGAGGGG GCAAAGGGGG AGGAGAAGAA AATGTTCTTC

801 CAGTTACTTT CCAATTCTGC TTTAGGGACA GCTTAGAATT ATTTGCACTA

851 TTGAGTCTTC ATGTTCCCAC TTCAAAACAA ACAGATGCTC TGAGAGCAAA 

901 CTGGCTTGAA TTGGTGACAT TTAGTCCCTC AAGCCACCAG ATGTGACAGT 

951 GTTGAGAACT ACCTGGATTT GTATATATAC CTGCGCTTGT TTTAAAGTGG 

1001 GCTCAGCACA TAGGGTTCCC ACGAAGCTCC GAAACTCTAA GTGTTTGCTG 

1051 CAATTTTATA AGGACTTCCT GATTGGTTTC TCTTCTCCCC TTCCATTTCT 

1101 GCCTTTTGTT CATTTCATCC TTTCACTTCT TTCCCTTCCT CCGTCCTCCT

1151 CCTTCCTAGT TCATCCCTTC TCTTCCAGGC AGCCGCGGTG CCCAACCACA

1201 CTTGTCGGCT CCAGTCCCCA GAACTCTGCC TGCCCTTTGT CCTCCTGCTG 

1251 CCAGTACCAG CCCCACCCTG TTTTGAGCCC TGAGGAGGCC TTGGGCTCTG
1301 CTGAGTCCAA CCTGGCCTGT CTGTGAAGAG CAAGAGAGCA GCAAGGTCTT

1351 GCTCTCCTAG GTAGCCCCCT CTTCCCTGGT AAGAAAAAGC AAAAGGCATT
1401 TCCCACCCTG AACAACGAGC CTTTTCACCC TTCTACTCTA GAGAAGTGGA 
1451 CTGGAGGAGC TGGGCCCGAT TTGGTAGTTG AGGAAAGCAC AGAGGCCTCC 
1501 TGTGGCCTGC CAGTCATCGA GTGGCCCAAC AGGGGCTCCA TGCCAGCCGA 
1551 CCTTGACCTC ACTCAGAAGT CCAGAGTCTA GCGTAGTGCA GCAGGGCAGT
1601 AGCGGTACCA ATGCAGAACT CCCAAGACCC GAGCTGGGAC CAGTACCTGG 
1651 GTCCCCAGCC CTTCCTCTGC TCCCCCTTTT CCCTCGGAGT TCTTCTTGAA 
1701 TGGCAATGTT TTGCTTTTGC TCGATGCAGA CAGGGGGCCA GAACACCACA

1751 CATTTCACTG TCTGTCTGGT CCATAGCTGT GGTGTAGGGG CTTAGAGGCA

1801 TGGGCTTGCT GTGGGTTTTT AATTGATCAG TTTTCATGTG GGATCCCATC 

1851 TTTTTAACCT CTGTTCAGGA AGTCCTTATC TAGCTGCATA TCTTCATCAT 

1901 ATTGGTATAT CCTTTTCTGT GTTTACAGAG ATGTCTCTTA TATCTAAATC

1951 TGTCCAACTG AGAAGTACCT TATCAAAGTA GCAAATGAGA CAGCAGTCTT

2001 ATGCTTCCAG AAACACCCAC AGGCATGTCC CATGTGAGCT GCTGCCATGA 

2051 ACTGTCAAGT GTGTGTTGTC TTGTGTATTT CAGTTATTGT CCCTGGCTTC

2101 CTTACTATGG TGTAATCATG AAGGAGTGAA ACATCATAGA AACTGTCTAG

2151 CACTTCCTTG CCAGTCTTTA GTGATCAGGA ACCATAGTTG ACAGTTCCAA

2201 TCAGTAGCTT AAGAAAAAAC CGTGTTTGTC TCTTCTGGAA TGGTTAGAAG

2251 TGAGGGAGTT TGCCCCGTTC TGTTTGTAGA GTCTCATAGT TGGACTTTCT

2301 AGCATATATG TGTCCATTTC CTTATGCTGT AAAAGCAAGT CCTGCAACCA

2351 AACTCCCATC AGCCCAATCC CTGATCCCTG ATCCCTTCCA CCTGCTCTGC

2401 TGATGACCCC CCCAGCTTCA CTTCTC-ACTC TTCCCCAGGA AGGGAAGGGG
2451 GGTCAGAAGA GAGGGTGAGT CCTCCAGAAC TCTTCCTCCA AGGACAGAAG

2501 GCTCCTGCCC CCATAGTGGC CTCGAACTCC TGGCACTACC AAAGGACACT
2551 TATCCACGAG AGCGCAGCAT CCGACCAGGT TGTCACTGAG AAGATGTTTA
2601 TTTTGGTCAG TTGGGTTTTT ATGTATTATA CTTAGTCAAA TGTAATGTGG

2651 CTTCTGGAAT CATTGTCCAG AGCTGCTTCC CCGTCACCTG GGCGTCATCT
2701 GGTCCTGGTA AGAGGAGTGC GTGGCCCACC AGGCCCCCCT GTCACCCATG
2751 ACAGTTCATT CAGGGCCGAT GGGGCAGTCG TGGTTGGGAA CACAGCATTT
2801 CAAGCGTCAC TTTATTTCAT TCGGGCCCCA CCTGCAGCTC CCTCAAAGAG
2851 GCAGTTGCCC AGCCTCTTTC CCTTCCAGTT TATTCCAGAG CTGCCAGTGG
2901 GGCCTGAGGC TCCTTAGGGT TTTCTCTCTA TTTCCCCCTT TCTTCCTCAT
2951 TCCCTCGTCT TTCCCAAAGG CATCACGAGT CAGTCGCCTT TCAGCAGGCA

3001 GCCTTGGCGG TTTATCGCCC TGGCAGGCAG GGGCCCTGCA GCTCTCATGC
3051 TGCCCCTGCC TTGGGGTCAG GTTGACAGGA GGTTGGAGGG AAAGCCTTAA 
3101 GCTGCAGGAT TCTCACCAGC TGTGTCCGGC CCAGTTTTGG GGTCTGACCT 
3151 CAATTTCAAT TTTGTCTGTA CTTGAACATT ATGAAGATGG GGGCCTCTTT 

3201 CAGTGAATTT GTGATCAGGA GAATTGACCG ACAGCTTTCC AGTACCCATG
3251 GGGCTAGGTC ATTAAGGCCA CATCCACAGT CTCCCCCACC CTTGTTCCAG
3301 TTGTTAGTTA CTACCTCCTC TCCTGACAAT ACTGTATGTC GTCGAGCTCC

3351 CCCCAGGTCT ACCCCTCCCG GCCCTGCCTG CTGGTGGGCT TGTCATAGCC
3401 AGTGGGATTG CCGGTCTTGA CAGCTCAGTG AGCTGGAGAT ACTTGGTCAC 
3451 AGCCAGGCGC TAGCACAGCT CCCTTCTGTT GATGCTGTAT TCCCATATCA

3501 AAAGGCACAG GGGACACCCA GAAACGCCAC ATCCCCCAAT CCATCAGTGC 

3551 CAAACTAGCC AACGGCCCCA GCTTCTCAGC TCGCTGGATG GCGGAAGCTG 

3601 CTACTCGTGA GCGCCAGTGC GGGTGCAGAC AATCTTCTGT TGGGTGGCAT



3701 AAATTGTCAC CTGCTTCTCT GCCCAGCTTT TCATTGCTGT GACAGTGATG

3751 GCGAAAGAGG GTAATAACCA GACACAAACT GCCAAGTTGG GTGGAGAAAG

3801 GAGTTTCTTT AGCTGACAGA ATCTCTGAAT TTTAAATCAC TTAGTAAGCG

3851 GCTCAAGCCC AGGAGGGAGC AGAGGGATAC GAGCGGAGTC CCCTGCGCGG

3901 GACCATCTGG AATTGGTTTA GCCC^.GTGG AGCCTGACAG CCAGAACTCT

3951 GTGTCCCCCG TCTAACCACA GCTCCTTTTC CAGAGCATTC CAGTCAGGCT
4001 CTCTGGGCTG ACTGGGCCAG GGGAGGTTAC AGGTACCAGT TCTTTAAGAA
4051 GATCTTTGGG CATATACATT TTTAGCCTGT GTCATTGCCC CAAATGGATT 
4101 CCTGTTTCAA GTTCACACCT GCAGATTCTA GGACCTGTGT CCTAGACTTC 
4151 AGGGAGTCAG CTGTTTCTAG AGTTCCTACC ATGGAGTGGG TCTGGAGGAC
4201 CTGCCCGGTG GGGGGGCAGA GCCCTGCTCC CTCCGGGTCT TCCTACTCTT

4251 CTCTCTGCTC TGACGGGATT TGTTGATTCT CTCCATTTTG GTGTCTTTCT

4301 CTTTTAGATA TTGTATCAAT CTTTAGAAAA GGCATAGTCT ACTTGTTATA

4351 AATCGTTAGG ATACTGCCTC CCCCAGGGTC TAAAATTACA TATTAGAGGG
4401 GAAAAGCTGA ACACTGAAGT CAGTTCTCAA CAATTTAGAA GGAAAACCTA 

4451 GAAAACATTT GGCAGAAAAT TACATTTCGA TGTTTTTGAA TGAATACAAG

4501 CAAGCTTTTA CAACAGTGCT GATCTAAAAA TACTTAGCAC TTGGCCTGAG

4551 ATGCCTGGTG AGCATTACAG GCAAGGGGAA TCTGGAGGTA GCCGACCTGA
4601 GGACATGGCT TCTGAACCTG TCTTTTGGGA GTGGTATGGA AGGTGGAGCG
4651 TTCACCAGTG ACCTGGAAGG CCCAGCACCA CCCTCCTTCC CACTCTTCTC
4701 ATCTTGACAG AGCCTGCGCC AGCGCTGACG TGTCAGGAAA ACACCCAGGG
4751 AACTAGGAAG GCACTTCTGC CTGAGGGGCA GCCTGCCTTG CCCACTCCTG
4801 CTCTGCTCGC CTCGGATCAG CTGAGCCTTC TGAGCTGGCC TCTCACTGCC

4851 TCCCCAAGGC CCCCTGCCTG CCCTGTCAGG AGGCAGAAGG AAGCAGGTGT
4901 GAGGGCAGTG CAAGGAGGGA GCACAACCCC CAGCTCCCGC TCCGGGCTCC

4951 GACTTGTGCA CAGGCAGAGC CCAGACCCTG GAGGAAATCC TACCTTTGAA
5001 TTCAAGAACA TTTGGGGAAT TTGGAAATCT CTTTGCCCCC AAACCCCCAT 
5051 TCTGTCCTAC CTTTAATCAG GTCCTGCTCA GCAGTGAGAG CAGATGAGGT 
5101 GAAAAGGCCA AGAGGTTTGG CTCCTGCCCA CTGATAGCCC CTCTCCCCGC 
5151 AGTGTTTGTG TGTCAAGTGG CAAAGCTGTT CTTCCTGGTG ACCCTGATTA 
5201 TATCCAGTAA CACATAGACT GTGCGCATAG GCCTGCTTTG TCTCCTCTAT



3651 CATTCCAGGC CCGAAGCATG AACAGTGCAC CTGGGACAGG GAGCAGCCCC
5251 CCTGGGCTTT TGTTTTGCTT TTTAGTTTTG CTTTTAGTTT TTCTGTCCCT

5301 TTTATTTAAC GCACCGACTA GACACACAAA GCAGTTGAAT TTTTATATAT

5351 ATATCTGTAT ATTGCACAAT TATAAACTCA TTTTGCTTGT GGCTCCACAC

5401 ACACAAAAAA AGACCTGTTA AAATTATACC TGTTGCTTAA TTACAATATT
5451 TCTGATAACC ATAGCATAGG ACAAGGGAAA ATAAAAAAAG AAAAAAAAGA
5501 AAAAAAAACG ACAAATCTGT CTGCTGGTCA CTTCTTCTGT CCAAGCAGAT
5551 TCGTGGTCTT TTCCTCGCTT CTTTCAAGGG CTTTCCTGTG CCAGGTGAAG 
5601 GAGGCTCCAG GCAGCACCCA GGTTTTGCAC TCTTGTTTCT CCCGTGCTTG 
5651 TGAAAGAGGT CCCAAGGTTC TGGGTGCAGG AGCGCTCCCT TGACCTCCTG 
5701 AAGTCCGGAA CGTAGTCGGC ACAGCCTGG? CGCCTTrCAC CTCTGGGAGC 
5751 TGGAGTCCAC TGGGGTGGZC TGACTCCCCC AGTCCCCTTC CCGTGACCTG 
5801 GTCAGGGTGA GCCCATGTGG AGTCAGCCTC GCAGGCCTCC CTGCCAGTAG
5851 GGTCCGAGTG TGTTTCATCC TTCCCACTCT GTCGAGCCTG GGGGCTGGAG
5901 CGGAGACGGG AGGCCTGGCC TGTCTCGGAA CCTGTGAGCT GCACCAGGTA
5951 GAACGCCAGG GACCCCAGAA TCATGTGCGT CAGTCCAAGG GGTCCCCTCC
6001 AGGAGTAGTG AAGACTCCAG AAATGTCCCT TTCTTCTCCC CCATCCTACG
6051 AGTAATTGCA TTTGCTTTTG TAATTCTTAA TGAGCAATAT CTGCTAGAGA
6101 GTTTAGCTGT AACAGTTCTT TTTGATCATC TTTTTTTAAT AATTAGAAAC
6151 ACCAAAAAAA TCCAGAAACT TGTTCTTCCA AAGCAGAGAG CATTATAATC
6201 ACCAGGGCCA AAAGCTTCCC TCCCTGCTGT CATTGCTTCT TCTGAGGCCT

6251 GAATCCAAAA GAAAAACAGC CATAGGCCCT TTCAGTGGCC GGGCTACCCG

6301 TGAGCCCTTC GGAGGACCAG GGCTGGGGCA GCCTCTGGGC CCACATCCGG

6351 GGCCAGCTCC GGCGTGTGTT CAGTGTTAGC AGTGGGTCAT GATGCTCTTT
6401 CCCACCCAGC CTGGGATAGG GGCAGAGGAG GCGAGGAGGC CGTTGCCGCT 

6451 GATGTTTGGC CGTGAACAGG TGGGTGTCTG CGTGCGTCCA CGTGCGTGTT
6501 TTCTGACTGA CATGAAATCG ACGCCCGAGT TAGCCTCACC CGGTGACCTC 

6551 TAGCCCTGCC CGGATGGAGC GGGGCCCACC CGGTTCAGTG TTTCTGGGGA
6601 GCTGGACAGT GGAGTGCAAA AGGCTTGCAG AACTTGAAGC CTGCTCCTTC 
6651 CCTTGCTACC ACGGCCTCCT TTCCGTTTGA TTTGTCACTG CTTCAATCAA 
6701 TAACAGCCGC TCCAGAGTCA GTAGTCAATG AATATATGAC CAAATATCAC
6751 CAGGACTGTT ACTCAATGTG TGCCGAGCCC TTGCCCATGC TGGGCTCCCG
6801 TGTATCTGGA CACTGTAACG TGTGCTGTGT TTGCTCCCCT TCCCCTTCCT
6851 TCTTTGCCCT TTACTTGTCT TTCTGGGGTT TTTCTGTTTG GGTTTGGTTT 
6901 GGTTTTTATT TCTCCTTTTG TGTTCCAAAC ATGAGGTTCT CTCTACTGGT 
6951 CCTCTTAACT GTGGTGTTGA GGCTTATATT TGTGTAATTT TTGGTGGGTG

Fig. 1 (cont'd 3)
7001 AAAGGAATTT TGCTAAGTAA ATCTCTTCTG TGTTTGAACT GAAGTCTGTA
7051 TTGTAACTAT GTTTAAAGTA ATTGTTCCAG AGACAAATAT TTCTAGACAC



7151 TGAGAGGGGA GAGCTGAACA GATGACCCCT GCCCAGATCA GCCAGAAGCC 

7201 ACCCAAAGCA GTGGAGCCCA GGAGTCCCAC TCCAAGCCAG CAAGCCGAAT 

7251 AGCTGATGTG TTGCCACTTT CCAAGTCACT GCAAAACCAG GTTTTGTTCC

7301 GCCCAGTGGA TTCTTGTTTT GCTTCCCCTC CCCCCGAGAT TATTACCACC

7351 ATCCCGTGCT TTTAAGGAAA GGCAAGATTG ATGTTTCCTT GAGGGGAGCC

7401 AGGAGGGGAT GTGTGTGTGC AGAGCTGAAG AGCTGGGGAG AATGGGGCTG

7451 GGCCCACCCA AGCAGGAGGC TGGGACGCTC TGCTGTGGGC ACAGGTCAGG 

7501 CTAATGTTGG CAGATGCAGC TCTTCCTGGA CAGGCCAGGT GGTGGGCATT 

7551 CTCTCTCCAA GGTGTGCCCZ GTGGGCATTA CTGTTTAAGA CACTTCCGTC

7601 ACATCCCACC CCATCCTCCA GGGCTCAACA CTGTGACATC TCTATTCCCC

7651 ACCCTCCCCT TCCCAC-GGCA ATAAAATGAC CATGGAGGGG GCTTGCACTC

7701 TCTTGGCTGT CACCCGATCG CCAGCAAAAC TTAGATGTGA GAAAACCCCT

7751 TCCCATTCCA TGGCGAAAAC ATCTCCTTAG AAAAGCCATT ACCCTCATTA

7801 GGCATGGTTT TGGGCZZZC^. AAACACCTGA CAGCCCCTCC CTCCTCTGAG

7851 AGGCGGAGAG TGCTGACTGT AGTGACCATT GCATGCCGGG TGCAGCATCT

7901 GGAAGAGCTA GGCAGGGTGV CTGCZZZZTC CTGAGTTGAA GTCATGCTCC 

7951 CCTGTGCCAG CCCAGAGGCC GAGAGCTATG GACAGCATTG CCAGTAACAC

8001 AGGCCACCCT GTGCAGAAGG GAGCTGGCTC CAGCCTGGAA ACCTGTCTGA

8051 GC-TTGGGAGA GGTGCACTTG GGGCACAGGG AGAGGCCGGG ACACACTTAG

8101 CTGGAGATGT CTCTAAAAGC CCTGTATCGT ATTCACCTTC AGTTTTTGTG

8151 TTTTGGGACA ATTACTTTAG AAAATAAGTA GGTCGTTTTA AAAACAAAAA

8201 TTATTGATTG CTTTTTTGTA GTGTTCAGAA AAAAGGTTCT TTGTGTATAG
8251 CCAAATGACT GAAAGCACTG ATATATTTAA AAACAAAAGG CAATTTATTA

8301 AGGAAATTTG TACCATTTCA GTAAACCTGT CTGAATGTAC CTGTATACGT
8351 TTCAAAAACA CCCCCCCCCC ACTGAATCCC TGTAACCTAT TTATTATATA
8401 AAGAGTTTGC CTTATAAATT TA 



7101 [...] CAAACAAAAG CATTCGGAGG GAGGGGGATG GTGACTGAGA
Murine sequence of the non-coding RNA gene (including the
putative promoter)

1 CTTAGAGTTT CGTGGCTTCG GGGTGGGAGT AGTTGGAGCA TTGGGATGTT

51 TTTCTTACCG ACAAGCACAG TCAGGTTGAA GACCTAACCA GGGCCAGAAG

101 TAGCTTTGCA CTTTTCTAAA CTAGGCTCCT TCAACAAGGC TTGCTGCAGA 

151 TACTACTGAC CAGACAAGCT GTTGACCAGG CACTCCCCCC AACAATATCC 

201 TCCCTCTTCC CCCCCCCCAC CCCCGCCCCG TGTGCTCGTT AGGGCAATTG

251 AAAGGACACT CCCATTTTTG GTGCCATTGA TGCCCTGTCC ATAATAGCTT

301 CCCTGACTTT TACACCACCC CAACTCCCAA TCTGAAGGAC TGGGAGGTGT
351 GATGCAGGAG AAACTATGGG ACTCTTGGGA GAAGACTATG GAGTTGGCCA

401 GTGATTAAGG CCCACTAATT CCAACTGTGG TAGCACAGAT CTGGCTCCAC 
451 ATCAACCCAA TCCAAAACTG ACAAGGATAT TTTGCAAAAA AAGAAAGTGG 
501 CACCTGTCTG ATCCAGCTCT GACATGGCTA GAGGTGAGTC CTAAACTGAT 



551 GGCTTATAAA CTAGCCTGAG CCACAGAAGA GTATGGCCCA GAGTGAAGTG

601 TCATCATCTG TTCACAAGGC ATGCTCCCCT AGAAGATAAT GCTAAAGAGG

651 TGCCATGGAG GCAGCAGGAC AAAGTACAGG CAGGCTAGGT GGAGTCAAGC

701 CAGGCCTAGT GCCACAGAAC AAGAGAGCAG TCTGACTAGT AATTAAGAGG

751 GAAGAAAGGA AAATATTCTT CCAATTACTT TCCAGTTCTC CTTTAGGGAC

801 AGCTTAGAAT TATTTGCACT ATTGAGTCTT CATGTTCCCA CTTCAAAACA

851 AACAGATGCT CTGAAAGCAA ACTGGCTTGA AATGGTGACA CTGTCCCACA

901 AGCCACCAGA C^TGOO.GTG TTCAGAACTA CCTGTATCTG TATATACCTG



951 CGCTTGTTTT AAAGTGGGCT CAGCACATAG GATTCCCAAG AAGCTCCGAA 

1001 ACTCTAAGTG TTTGCTGCAA TTTTATAAGG ACTTCCTGAT TGCTTTCTCT 

1051 CTCGTCCTTC CATTTCTTCC TTCCTTCCAT TTCATGCTTT CATTTCTTCC

1101 CCTAGCTTCT AGTTGTTTCT TCTGTTCCAG GCAGCTGCAG TGCTGAACCA 

1151 CATGGTTACC TAACAGCAGT CAGCTGCAGC CCTAGGATTC TTCCTGCCCT

1201 TTAACTTCCC ATTGCCAGTG CCAGGTATCA TATTTAACCT TGAGCAAGAG

1251 CTGGGCTCTT TTGAGCCCTC CCTAACCTCT GTGAAGAAGA ACAAGAAGGT

1301 AGGAAGCTCT TGCTCTTGCT AAGAAAAATG TCAAAAGGCT TTCAGACCTT

1351 AAACAATGAG CCTTTTCACC TTTTACTCTA GAAAAGTGGA CTAGAAAATC
1401 TGGGTCACAT TGGGTAGCTG AAGGAGATAC AGAGGCCCCT ATGGCCTGCC
1451 AGAGTCGTTG CATGGCCCAA CAGGGGCTCC ATGCCCACTA CCCTTGACCC
1501 TACTCAGAAA TCTAATGTCA TACTTAGTGT GGGCAGGGGA CCTGTCAGGA
1551 CAGATGCAGA CCTAAGCAGG GAGTGACACC AGGGCCCTTG GCCCTTCTTC
1601 TGACAAACAT ACACATCCCA AGTCTTTTTC TAGTGGAATT CTTAACCTCT
1651 TGCTCACTGG GGACTGGGAA GCATCAGCAC ATCCGATATT TCAAACTCTG
1701 CTCCATAAGT ACAGTGGTGA ATTTTATAGA CTTGACTTTG CTGTGGGGTT

1751 TTAATTGGTC AGTTTTAATT TGGGATCCCA AAGTTTTAAC CTCCATTCAG 

1801 GAAGTCCTTA TCTAGCTGCA TATCTTCATC ATATTGGTAT ATCCTTTTCT

1851 GTGTTTACAG AGATGTCTCA TATCTATCGA AATCTGTCTG AGAAGTACCT

1901 TATCAAAGTA GCAAATGAGA CAGCAGTCTT ATGCTTCCAG AAACACCCAC

1951 AGGCACGTCC CATGTGAGCT GCTGCCATGA ACTGTCGAGT GTGTATTGTC

2001 TTGTGTATTT TCGTTAACGT TCCCCAGCTT CCTTCCTGCG GTGTAATCAT

2051 GGAAGAGTGA AACATCATAG AAATCGTCTA GCACTTCCTG GCCAGTCCTT

2101 AGTGATCAGG AACCGTAGTT GACAGTTCCA ATTGATAGCT TAAGATAAAA

2151 CCATGTTTGT CTCTTATGGA ATGGTTAGAA CTAAGTGAGA GATCTTGCCC

2201 CATTCTGTTT GCCGAATCAT AGTTGGACTT TTAGTGTATT TGTATCCATT

2251 TCCTTGTGCT ATAAAAGCAA ACCCTGCAAC CAGCTTTCTG TCAGGCAGTC 

2301 CTTTTGCCTG CTCTGCTTTT GATCCTCTTA GTCTTGCTTC TGGTTCCTCC

2351 CTGGAGAGGG AGGAG-GGGTC AGAAGAGGAA TTCTGGAGGA TCCAGGATAT

2401 GTCCTTCTGA ACTCCTGCTT CTTCCAGTGA CAAAAGGCCC CTACTGCCCC

2451 ACCCCAACCT GCCCCATGCA CTCCTCTAC-G ACACCTTTCC ATACTTTTCA

2501 CAACACCTAG CCAGGTTGAC ACCAAGTTGT TTATTGTGGT CTGCTTGGAA

2551 TTTTACCTGT TAGGCTTACT TAGTCCAATC AAATGGACTC CAAGTTGGGT

2601 ATCCCTCATC TTTGGAAGAC AACCTAGGCT GATTAGATAT TTACTTTTGG

2651 GATTGCAGCA CTTZ'GGGTGC CGTTTTTCTT TTACTTGGGT TTTATCTGCA

2701 GCTCCCTCAC CACCACCACC ACCCCCCACT TACCTGTATG TAGAACTGAT

2751 TTCAAAACTG GAGGTGGTGG TAACTGCAGC TTCTTAGGGT TTTCTTCACT

2801 TCTTGCTTCT TTCCCCATTC CCTCATCCAC AAATAAGGGC ATCACAAGTC

2851 AGTCTCCTTT AAGCAGGCAG CTTTGGTGGG GTTTTTCCCC TGGAAGCCAG

2901 GGACCCTGTC AGGCTGCCTC TGCCTTGTGC- TCAGGTTGAC AGGAGGTTGG

2951 AGGC-AAAAGC CTTAAGTCA? GGGATTCTCA CCAGCTC-TGT CTGGCTCAGA

3001 CCTGGAATGT GACCTTTATT TTGTTGTATT TGAACA"TGT AAAGTGTGGG

3051 TGGTACCTTA AACTGAATAT GTGAAGAATC CAGAAACTGA CCAACAGCTT
3101 TCAGATACCT GGGGCTAGGT CACTAAGGTC ACATCCAGTC TTCCCTACCC
3151 TGTTCTAGTT GTTAGCTACT ACCTCTCCCA GATAGATTGC TGTATATCCT
3201 CCAACTATGA TCATCCTGGC CCAAGCTTGC CTGTTCTTGA GTCTGTCTTA
3251 ACCAGTGGAA CTGCTGCCCT TGGTGTGCAG TGAGTTGAGG ACTCTTGGTC
3301 ACAGCCAGGC TCTAGTAGTA CAGCTCCTTT CTGCTGGTGC TGTATTTCCA
3351 TATCAAAAGG CACAGGGGAG ATCTAGAAAT GCCATCTCCC CCAGTCCATC
3401 AGTGCCAAAC AAGCCCATGA TCCCAGCATG GGTACAGACA ACTCTGTTCA

Fig. 2 (cont'd 1) 
3451 GTGCTATCAC AACAGACTAG AGGCCATGAA CATTGGACGT GGGAACCAGA

3501 GCAACCCGAA TTGCTGCTGC TTTATTCAGC TTTCCGTTGC TCTGACAATG 

3551 ATAAAACAAG GCAGTAACTT AAAACAGACT GCCAGGTTTG GCAGAGAAAG

3601 GAAATTCCTT AGCTGACAGC ACCTCTGGAT TTTAAATAGG TTGTAATAAG

3651 TGGCTCAAAC CCATCCAGGA AAAAGCAAAA GGGTTAGAAC TGACCAGATG

3701 AGACCAGCCT GATTTCATGC AGCCCAAATG GAGTCCAGCT GTCTGAACTC 

3751 TGCAGCACTT CTCTACTACA GTCTCCTAGA GCATTCCAGC CAGGCTCTTC

3801 AGGCTGAGGA GACATCACAG GTGCCAGTTC TTCAAGAAGA CTTTTGTGCA

3851 TCAGTTCATA GCCTATATCT TTGCCCAAGA TTGTAGATTC AGGTTAACAC

3901 TACAGATTCT AGGGCAGATG ACTGAGACTC AGAAAAAAAG CCCCTGTGGA

3951 CTGTGGTATA GCGAAGTACA AAAACTGAAG GGGGCTAGGG CAGATGCCGC
4001 ATGCCTCATG CCAGAGCCAA GCCCTCTGCT CCATCCACAT CCTTTTCTGG

4051 CTCCTTCTTC CTGCTCTCTG CTTCAGTGAA CCAGCCCCAC TCTGAAGAGA
4101 TTTGTTGATT CTCTCCATTT TTATGTCTTT CTCTTTTAGG TACTATATAG 
4151 AAAAGGCTTA GTCTAATTGT TATAAATTGC TAGAATACTG CCTCCCCCAG
4201 GGTCTAAAAA TATATGCTAA AGGGGAAAAC TTGAACACTG AAACCAGTTC

4251 TGAACAATTT AGAAGGAAAA CCTTGAAAAC ATTTAACAAA AAATTATATT
4301 TTAATGTTTA TGAATAAGAG GAGGCTTTTG AAAAAATGTT GATCTATAAA

4351 TACTTACTTT AGGCCTGAGG TGTCTAATGA GTGAACTGAG CAATGGGAAC
4401 TCAAGGCTGA AGCCTCCTGC ATCAGAGGAG GTAGAACCAG GAGCCTCTTG 
4451 AGATTTGAGG TGTTTTAGCA TTGGAAAGCC ACTCTTTGGG TAGCTGGCCC
4501 CAGAAACTAC TTCTGACCTT GTCATTTGGA ATGGAGGTTA GTGGTCTGCC
4551 AGATGCCAAA GCTGCATGAG ACCAGCTCTT GGTTTATCAA TTTGAACACT
4601 CAGTAACCTA GAAGGCCCAG CACAAAGTGT CTGCTCTCTT CTTAACTGAG
4651 CCTGCCCCAG CACTACTGCA CAAATTAGGG AGGGTCTACT TCCTACAGAG 
4701 CATCCCTCCC TGGGCCCCCT CCCATCCTTT GTACTCTACC TACCTGACCT 
4751 TCAGGATCTT GGCACATACG AAATGGCTGT GTAGCAAGCA CTTTGGCATG
4801 CCCTCCTAAA CTTACCCCAG AGCCTCTCCC TGCCTCCTTA AGCCAGTCTG 
4851 CCTGTCTTCT GGGGAGGTGT TAGAGCCCAT AGAATGGAGA GGAGAAAGAA 
4901 AAGAGGAAGA GGCAGGCAGG TAGTAAAAAG GCTCTGGGAG GAAAGACAGC 
4951 CTCCTAGGCT TTGCACAAGC AGGACTCAGC CCCTTGTGGG AACTAAGTGC 
5001 CATCTTGGAG TTTAAGAACA TTTGGACAAG TTGCAAATGA CCTTTGCTCC 
5051 TTGCTCCTCT CACCTTTTAT GGGGCCCTGC TTAGCACTGA AAGCAAATGC 
5101 GCTGAAAAGG CAAAGAGGTT TGGCTCCTGC CCACTGATAG TCCTTTCCCT 
5151 GCAGTGTTTG TGTGTCAAGT GGCAAAGCTG TTCTTCCTGG TGACTCTGAT 
5201 TAGATCCAGT AACTTAAGAG ATTTGTATGC ATAGGTCTGC TTTGACTCTT 
5251 CTATTCTGGG CTTTTGATTT GTTTTTCAGT TTTGCTTTTA GTTTTCCTAT

5301 TTTTATTTTA TGCACCAACT AGACACACAA AGCAGTTGAA TTTATATATA

5351 TATATATATA TATATATCTG TATATTTCAC AATTATAAAC TCATTTTGCT
5401 TGTGACGCCA CACACACACA AAAAGAAAAA CCTTTTAAAA TTATACCTGT 

5451 TGCTTAATTA CAATATTTCT GATAACCATA GAGTAGGACA AGGGAAAAAA
5501 TTTAAAAAAA AAAAAAAAAA AAGAAAAAAC ACATCTGTCT GCTGGTCACT 

5551 TCTTCAATCC AAGCAGATCT GTGATCTTTC CTCGCGTCTT TCAAAGACTT
5601 CCCTGTGCTA AGTGAAGGAA GCTCCAGGCT GCACCCAGGT TTTGTGCTTT 
5651 GTTTCTCCTC TGTTGTGAAA GGGGCCCCAA GATTCTGGGT ACAGGACAGT 
5701 TCATTTCAGC ATGGGGTCAG GAGACAAGAG CACTCCCTTT ACATGCTGAC
5751 GTACAGAACT TAGTGGGAAT AGCCTAGTCC CCACCTCTAG GGATGGGGAG
5801 CTAGCATGCA TGGGGGTGAC CCAACTCCCT CCACCTTTCC CTGGCCAGGA

5851 AGAGCCTGTG TACAGTAAGT CTGACAAGCT TTCCCCAGTT AGCAGGGCTC

5901 AGAGCATTTA AAAACCCTCC AAACTTTGCT GAGTCTAGGG ACTAGAGAGA
5951 AGATAGAAGA TTTGGTCTAT CTCCAAGGTG TGTAAGCTGT ACCAGGTAGA
6001 ATGCCAGGGA CCCCAGAACC ACATCCAACA GCCCAATGGG TCTCCTCCAG

6051 AAAGTAGTGA AGACTCCAGA AACATCCCTT TCTCTTCTCC CTGCTCCCAT
6101 GAGTAACTGC ATTTGCTTT? GTAATCCTTA ATGAGCATTA TCTGCTAAAA
6151 AAAAAAAATT AGCTGTAACA GTTCTTTTTG CAAAAGGATC ATTCTTAAAT

6201 AATTAAAAAC ACCCCCCCCC CAAAAAAAAG TCCAGAACCT TGTTCTTCCA
6251 AAGCAGAGAG CATTATAATC AGGGCCAAAA TCTGTCCCAC ACCTCTACCC

6301 CATCTCCTCA TGATTGCTGC TTCTAAGGCC AGAATACAGC AAAGATATTT
6351 GTAGGCCCTT TGGGTGACTG GGCTACCCTT GGAGCTCTTG GAAGATGGGC
6401 TGGGGAAGCC TCTGAGACCC TATCCTAGGG CCTTC-CTCTA GGGAGTAATC

6451 AGTATTAGTA GAGTGTCACA ACATTATTCC CCAGCCGGCA TGAGATGGGG
6501 GCAGAAGAAG CCAAAGGGTT GTCTCCACTC- CTACTTACTT GGCCACTGAC
6551 AGGTAGGTGA CCATGTATGT CCATATGCAT GTTTTATGGC TGATGTGAGA
6601 TCAGCACCCA AGTTAGCTTC ACCTGGTGAC CTCTAACCCT GCCTGGATGG 
6651 AGCAGGCCAC CTGGTTCAAT GTTTCTGGGC AGCTGGACAA TGGAGTGCAA 
6701 AAGGCTTACA GAACTTGAAG CCTTTTCCTT ACTTTGCTAG CACGGCCTCC 
6751 TTTTCCATTT GATTTGTCAC TGCTTCAGTC AATAACAGCC GCTCCAGAGT
6801 CAGTAGTTGA TGAATATATG ACCAAATATC ACCAGGACTG TTACTCAACG
6851 TGTGCCGAGC CCTTTCCTTG TGCTGGGCTC CCTGTGTACC TGGACACTGT 
6901 AATGTGTGCT GTGTTTGCTC TCCTTCCTCT TCCTTCCTTG CCCTTTCCTT 
6951 GTCTTTCTGG GGTTTTTCTG TTGGGTTTGG TTTGGTTTTA TTTTTCCTTT 
7001 TGTGTTCCAA AtATGAGGTT TTCTCTACTG GTCCTCTTTA ACTGTGGTGT
7051 TGAGGCTTCT ATTTGTGTAA TTTTTGGTGG GTGAAAGGAA CTTTGCTAAG 
7101 TAAATCTCTT CTGTGTTTGA AATGAAGTCT GTATTGTAAC TATGTTTAAA 
7151 GTAATTGTTC CAGAGACAAA TGCTTCTAGG TACATTTTCA TTACAAACAA 
7201 AGCATTTGAA GGGAGGGAAG TGGTGAATAA GACAAGAGGG GCAATCTGAA

7251 TTGATCCCTG CCCAGATCAG CCAGAAGCTA CCAAAAGTTA AGCACTGGTT

7301 TTCCATTCCA AGTCAAGAGA CTGAAGCTGA TGTTTTGCCA TTTTCAAAGT

7351 CAAAGCAAAA CCAGCTTTTC CACCCAATGG ATTCTTTGCT TCTCCTTCCC
7401 AGATTATTAC TACTGCTGTA ATAATCTAGG AGTGCCAGGA GGGAAAGGAG 

7451 TATTAACACA GAGCTGTGGT CACTGAGTAT GGAAAGGCTT GGTCTGAGTT
7501 TTCAGGAGGA TGACCCACTG TGGACATGGG GAGAAGACAG AAGATAAATT

7551 AGCCGCTCCC TGCCTAAGA? ACCTCTTAA? AGATAAGTCA AGGCCATGGA
7601 CATTATTGTC TACAAGGCAT GTTTCAAAGA CATGACCAGT CAGGACACTT

7651 CTGTCATACT CCATGTTGCC CCCTAGTACA CAGTACTAAT CTGATATCTC

7701 TGTTCCCGCC ATGCCTGGGG GATAAAATGA TAGCAGAGAC TCCTTTCCTT

7751 CAATGTC-ATC TAATTCCCAA CAAAATCTGG GCCTGAGATA CCACCTGTTT
7801 CTATGGCAAA CATCCTCAGT AAAGTGTTAT TCTCATTGCA GATTGTTCCA

7851 GCCTAATGTA AGAGGAACAG AGCAGTGTTC CCTTGGAGCC TCATGTGGAC
7901 AGTTCTACCT GTAGTGACCA GTTGGCTATA GTAGTTATTA GCTGGAACAA

7951 CCAGACAGGG TACATGCCCC CTCCAAAATC CATGTTGTAC TCCCCTCTGC
8001 CAGCCAGGGG GGGTGAGATC TGTAGA-.TAG TGCAGCCAG? GACAAGCCAC
8051 CTTGTGTTTG TCACCAGCTC AAAAACTCAT CTAAGGTTGG GAGCAGGCAG
8101 ACAAGGCAGA GAGAAAGATC CAGGACAGAC CTAGCTGGGC TGGAGGGGTC
8151 TTGAAAAGCC CTCTGTCGTA TTCACCTTCA GTTTTTGTGC TTTGGGACAA
8201 TTACTTTAGA AAATAAGTAG GTCGTTTTAA AAACAAAATA TTGATTGCTT

8251 TTTTGTAGTG TTCAAAACAA AAGGTTCTTT GTGTATAGCC AAATGACTGA

8301 AAGCACTGAT ATATTTAAAA ACAAAAGGCA ATTTATTAAG GAAATTTGTA
8351 CCATTTCAGT AAACCTGTCT GAATGTACCT GTATACGTTT CAAAAACACA
8401 CCCCACTGAA CCCCTGTAAC CTATTTATTA TATAAAGAGT TTGCCTTATA
8451 AATTTACATA AAAA 
CTTAGAGTTTCGTGGCTTCAGGGTGGGAGTAGTTGGAGCATTGGGGATGT
TTTTCTTACCGACAAGCACAGTCAGGTTGAAGACCTAACCAGGGCCAGAA 
GTAGCTTTGCACTTTTCTAAACTAGGCTCCTTCAA 

ATACTACTG ACCAGACAAGCTGTTGACCAGGCACCTCCCC 

TC CAACAATATC 

TCCCGCCCAAACCTTTCCCC 

CTCCCTCTTC C CC CCCG G- 

GCGACAGAGCAGTTGAGAGGACAC 
--G---A A 



CCATCAGTGCCC 
-TGA 

rrCCCCCACTCCCAACCAC 
-A A . . T- 

GTTGGGACAGGGAGGTG TGAGGCAGG AGAGACAGTTGG ATTC 1'ITAGAG A 
TGAA T T A — TA-G C GG 

AG. . . ATGGATATCACCAGTGGCTATGGCCTGTGCGATCCCACCCGTGGT 
— ACT- GT — G AT — A . CACTA — T A-T 

GGCTCAAGTCTGGCCCCACACCA.GCCCCAATCCAAAACTG 

A — A — GA T T — .A A TAT 

TTCACAGGACAGGAAAGTGGCACCTGTCTGCTCCAGCTCTGG 
--TG--AA-A-A A A . 

GGAGGGGGGAGTCCCTTGAACTACTGG . GTGTAGACTGGCCTGAACCACA 
. . --A--T . . -A GA CT-A — A A G 

GGAGAGGATGGCCCAGGGTGAGGTGGCATGGTCCATTCTCAAGGGACG . T 
-A T A A T CA — TG A C-T-C- 

CCTCCAACGGGTGGCGCTAGAG GCGATGGAGGCAGTAGGACAAGGT 

--C-T-GAA-A-AAT A— AGGT C A-- 

G CAGGCAGGCTGGCCTGGGGTCAGGC CGGGCAG AGCACAGCGGGGTGAGA 
A A- . G A A A CT — TG-CA-A-AACA— . 

GGGATTCCTAATCACTCAGAGCAGTCTGTC^ 

ACTAG — A- 

GGCAAAGGGGGAGGAGAAGAAAATGTTCTTCCAGTTACTTTCCAATTCT 
. . -T A A — A-G A A G 

CTTTAGGGACAGCTTAGAATTATTTGCACTATTGA 



CTTCAAAACAAACAGATGCTCTGAGAGCAAACTGGCTTGA^ 
A A 

TTT AGTCCCTCAAGCCACCAG ATGTGACAG TG TTGAGAACTACCTGGATT 

C- . . A CA — G C T — C 

TGTATATATACCTGCGCTTGTTTTAAAGTGGGCTCAGCACATAGGGTTCC 



CACSAAGCTCCGAAACTCTAAGTGTTTG^TGCAATTTTATA^ 

— A 



TGATTGGT 
C- 



CCCTTCCATTTCTCCCTTTTGTTCATTTCATC 
-CTCGT T CCT-C G 



CTTTCACTTCTTTCC 
T c AG- . 



TAGTTCATCCCTT 
. -T . G-TT 



CTCTTCCAGGCAGCCGCGGTGCCCAACC ACACTTGTC 

— G T— A TG ACATGGTTACCTA GCA 

GGCTCCAGTCCCCAGAACTCTGCCTGCCCTTTGTCCTCCTGCTGCCAGTA 
A G— - . T--C-T— T AA-T CAT G 



CCAGCCCCACC 
. .GT-T-A-A- — 



AGCCCTGAGGAGGCCTTGGGCTCTGCTGAGT 
. -A— T C-A-AGC TT C 



CCAACCTGGCCTGTCTG . TGAAG AGCAAG AGAG CAGCAAGGT 
--TC AA C-G--AA A AG-T — G C 

CCTAGGTAGCCCCCTCTTCCCTGGTAAGAAAAA . . GCAAAAGGCATTTCC 
— C TGT . A 

C^CCCTG AACAACGAGC CTTTTCACC CTTCTACTCT 

G T-A T . T A A 

GAGGAGCTGGGarCGATTTGGTAGTTOAGGA 

— AA-T T-AC G C A-G -GAT . A- 

GGCCTGCC . . AGTCATCGAGTGGCCCAACAG^^GCTCCATGCCAGCCGAC 
AG -G-T-CA CA-TAC- 

CTTGACCTCACTCAGAAGTCXJU1AGTCT 

CT A — T-AT ATA-T T . . 

GCGCTACCAATG<IAGAACTCCCAAGACCCG^ 

. . — G TG -CAG — CAGATGC TA A GTGAC A 

CCCAGCCCTTCCTCTGCIX:CCCCU" r in XX CTCGGA C; ^ 
-TTG T ACAAA-A-ACA-ATC-CA CT--T-CT-G- 

GGCAATGTTTTGCTTTTGCTCGATGCAGACAGG . . . GGGCCAGAACACCA 
— A-T-C — AAC — C AC — GG T — GAA-CAT C T-C 

CACATTTCACTGTCTGTCTGGTCCATAGCT 

— T AAC C AG-ACA CT-AATT T— A 

CATGGGC ITGCTGTCGG TTTTT AATTG ATCAG TTrTCATGTGGGATCCCA 
-T — ACT -G G A— T 
1781 ^^" rrr ^ CCTC ^ n ^



1899 ATATTGGTATAT 
1831 



GTTTACAGAGATGTCTCTTA . . TATCTA 



1947 AATCTGTCCAACTGAGAAGTACCTTATCAAACTAGO^ 
1997 TCTTATGCTTCCAGAAACACCC^ 



2047 ATGAACTCTCAAGTGTGT 
1977 G 



TATTTCAGTTATTG . TCCCTG 
— TC AC-T CA 

2096 GCTTCCTTACTATGGTGTAATCATGAAGGAGTGAAACATC^ 

2027 C — GC G-A TC- 

214 6 TCTAGCACTTCCTTGCCAGTCTTTAGTGATCAGGA 

2077 G C G 

2196 TCCAATCAGTAGCTTAAGAAAAAACCGTCTTTG 

2127 TGA A A A 



2246 AG AAGTGAGGGAGTTTGCCCCGT 

2177 — AACT A — TC A- 



TAGAGTCTCATAGTT 
CC — . .A 



2292 GGACTTTCT A GCATATATGrrGTCCA rriXXri 'ATGCTGTAAA^ 
2225 . TG T A- 
-TG-

CTGCAACCAAACTCCCATCAGCCCAATCCCTG^ 

GCT-T-TG G-. -GT TTG- 

CTGCTCTGCTCATGACCCCCCCAGCTTC^ 

TT T — T-TT — TC-TG GT — C TG-AG- 

GGGAAGGGGGGTCAGAAGAG AGGGTGAGTCCTCC 

G-A GAATTCTGGAGGATCC A-AT T- 

AGAACT CTTCCTCCAAGGACAGAAGGCTCCTGCCCCCATAGTGGCC 

T CCTG T GT A C A-TG — CC- — 

TCGAACT . . . CCTGGCACTACCAAAGGACACTTATCCA . CGAGAGCGCAG 
C-A-C — GCC--AT C-TCT C-T TACTTTT-A — A 

CATCCGACGAGGTTGTCACTGAGAAGATGTTTATT^ . TTGGGT 
— C-TAG A . - .C T G T-C AA 

TTTTATGTATTA .... TACTTAGTCAAATGTAATGTGGCTTCTGGAATCA 
CC-G GGCT C CA 

TTGTCCAGAGCTGCTTCCCCGTCACCTGGGCGTCATCT^ 
-AC AA — TGGG - ATCCC -T-G 

AGGAGTGCGTGGCCCACCAGGCCCCCCTGTCACCCATGACAGTTC^ 

- — AC-T 

GGGCCGATCGGGCAGTOSTG<n*TGGGAACACAGC^ . ACT 

A T TA-AT-T-TACTT TTG C — TGG-T-C-GTT- 

TTATTTCATTCGGGCCCCACCTTGCAGCTCCC^^ 

— C T-C-T TTTT-T . . -CCAC — CCAC-A-C 

GCCTCTTTCCCT TCCAGTTTATTCCAGAGCTGCCAGTGGGG . . . C 

C — CAC — A GTATG - AG - AC -G T — A-A AG T-GTAA 



(nXSAGGCTCCTTAGGGTTTTCTCTCTATTTC 

CA T . . TC-C TTG CT C 

CCAAA GG CA.TCACGAGTX1AGTCGCCTTTCAGCAGGC 

-A TAAG A T A 



AGCCTTGG . CGGTTTATCG^CCTGXXAGG^JWKXMCCGTGCAGCTCTCAT 
T TG — G--T-TC A — C A G 

GCTCCCCCTCCCTTGXSGGTCAGCra . AAAGCCT 



TAAGCTGCAGGATTCTCAC 
TCATG 



AGTTTTGGGGTCTGA 
--T T ACC AA-G 



CCTCAATTTCAATTTTGTCTG^ACTTGAACATTATGAA . . GATGGGGGCC 
— -TT T G.A. . GT.TG. .T. .TA 

TCTTTCAGTGAATTTGTGAACA. . GCAG . AATTGACCGA CAGCTTTCCAG 
C AA-C A G-ATC A — C A AGA 

TACCCATGCGGCTAGXn^ATTAAGGCCACATCCACAGTCTCCCCCACCCT 



TGTTCCAGTT G TTAGTTACTACCTCCTCrC^^ 

T c TC-CAGAT-C-T-G A — C- 

CTGAGCTCrCCCCAGGTCTACCrCTCCCGGGC^ 

-C-A — AT -A — AT — TGG AAG-T T-CT — A-TC — 

TCATAGCCAGTGGGAT 
— T--A A-C- 



TTGGTCACAGCCAGGCGC . . -TA 
T-TAG T- 



ACAGCICAGTGAGCTGGAGATAC 
- . GT-TG T — AG — CT- 



TTCCCATATCAAAAGXSCACAGGGGACACCCAGAAACGXTCACJITC 

T G-T-T T TC G 

T?f^T^!T???!^?A^^C^T^A^ 
GGCGGAAGCirXTTACrrCGTGAGCGCCAGT 

TTGGCTGGCATCATTCCAGGCCCGAAG . CATGAACAGTGCACCTGGGACA 
— CA CT CAA A-TA — G-C -T — G — G AC 



Fig. 3 (1) 
GGGAGCAGCCCCAAATTGTCACCTGCTTCTCTGCCCAGCTTTTCATTGCT .
CA A G — T- G G — T-ATT C-G 

GTGACAGTGATGGCGAAAGAGGG7AATAACCAGACACAAACTGCCAAGTT 
C A . . .A OA — C-G TTA-A G G 

GGGTGGAGAAAGGAG Tl'l^rrTTAGCTGACAG AATCTCTGAATTTTAAATC 
T--CA AA C C-C G A 

ACT. .TAGTAAGCGGCTCAAGC CCAGGAGGGAGCAGAGGGATACGA 

GG-TG — A T A-CCAT AAA A -A — G-TA — 

GCGGAGTCCCXTGCGCGGGACCATCTGGAATTGGTTTAG 

A-T— -CAGAT-A G-CT — T — CA-GC A 

CCTGACAGCCAGAACTCTGTGTCCCCCCTCTAACCACAG^ 
T-CAG-T-T-T CAG-A-TTC . — T .... -C-C-T- 

GAGCATTCCAGTCAGGCTCTCTGGGCTGACT^^ 

C TCA A--CA-C 

GTACCA G TTCTTT A AGAAGAT Cl " X T GG GCATATACATyr i T A GCCTGTGT 
„ G C CT T-- . . — CAG— CA A-A- 

CATTGCCCCAAATGGATTCCrrGTTTCAAGTTC^ 

-T A- GTA-A G A —A 



ACCTGTGTCCTAGACT TCAGGGAGTCAGCTGTTTCTAG 

G-AGA— A-TG CAGAAAAAAAGCC-CT-TG-A-T-TG— A-AGC- 

AGTTCCTACCATGGAGTGGGTCTGGAGGA CCTGCCCGGTGGGG 

-AG-A-A-AA-CT— AG G — A-G-C-GATGCCG-A TCA— CCA 

GGGCAGAGCC . . CTGCTCCCTCC GGGTCTTCCTACTCT 

-A— CA CT A ACATCCTTTTCT — C — C-T — T- . — 

TCTCTCTG * CTCTGACGGGATTTGTTGATTCT 

G CTTCAGTGAACCAGCCCCA A-A 

CHT CATTTTGGT<J11w w l"ri VI V1'*1'TTAGAT ATTG T ATCAATCTTTAGAAAA 

TA G — C-A — 

GGCATAGTCTACTTGTTATAAATCGTTAGGATACTGCCTCCCCCAGGGTC 



T AAAATTACAT ATT AG AGGGGAAAAGCTGAACACTG AAG TCAGTTCTCAA 
A- -T — GC — A CT AC G-- 

CAATTTAGAAGGAAAACCTAGAAAACATTTGGCAGAAAATTACATTT^ 
T AA — A T TAA 

TGTTTTTGAATGAATACAAGCAAGCTTTTACAACAGTC 

AG — G-G GA — A-A — T T 

TACTTAGCACTTGGCCTGAGATGCCTGGTGA 

. -TT-A G — T — AA TGAACTGA T 

TCTGGAGGTACCCGACC 

GAGATTTGAGGTGTTTTAGCATTGGAAAGCCAC TTG T-G— 

TCAGCACATGGCTTCTGAAUJT G TCTTTTGG GAGTGGTATG 

CC- . — A-CTA C-T A AATGGAGGTT C— 

GAAGGTG GAGCG 

CC— A- -CCAAAGCTGCATGAGACCAGCTCTTtKTTTATCAATTT — A-A 
TTCACCAGTGACCTGGAAGGCCCAGCACCACCCTCCTTCCCACTCTTCTC
C- . . . A A A-AGTGT — G- 

ATCTTGACAGAGCC^CCCCAGCGCTGACGTGTCAGGAA 

T A— T A — — TGCACA 

AACTAGGAAGGCACTTCTGCCTGJUJGGGCAGCCTGCCTT - .GCCCACTCC 
— T G GT — A — T ACA-A T — CT — C-GG C 

TGCTCTGCTCGCCT CGGA 

CATCCTTTG-A A-CTA GACCTTCAGGATCTTGGCACATA — A- 

TCAGCTGAG CCTTCTGAGCT GG 

ATG T-TAGCAAGCACTTTGGCATGC C--A-A— TACCCCAGA- 

CCTCTCACTGCCTCCCCAAGGCCCCCTGCCTGCCCT 

C TT C-AGT T-T-CTGGGGAGGTGTTA 

GTCAGGAGGCAGAAGGAAGCAGGTG 

GAGCCCATAG AATGGAGAGGAG AAA - AA - A A G-C-G A 

TGAGGGCAGTGCAAGGAGGGAGCACAACCCCCAGCTXX^C^ 
GT-AAAAG-CT-TG A -AG G— T--TAGG- 

CGACTTGTGCACAGGCAGAGCCCAGACXXTGGAGG AAATCCTACC 

-T A GA-T C T-T — GAACT— G-G-C-T- 

TTTGAATTCAAGAACATTTGGGGAATTTGGAAATCT^ 

— G — G — T AC — G C GA-C — TG-T — TTG- 

TACCTTrAATCAGOTCCTGCTCAGCAGTGAGAGCAGA 
. . . — TC T— GG— C T C A A- 

TGAGGTGAAAAGGCCAAGAGGTTTGGCrCCTGCCCA 

— C-C A .T T- 

CCCCGCAGTGTTrGTGTG'rCAAC 



TGATTATATCCAGTAACACATAGA . . . CTGTGCGCATAGGCCTGCTTTGT 
G TT- . GATT AT T A 



CTCCTCTATCC 



TA Uri - nX.L T ri T AG TTTT 



TCTCTCCCTTTTATTTAACGCACCGACTAGACACACAAAGCAG 
C--A-. . . T-T A 
TTTATATATATATCTGTATATTGCACAATTATAAACTC

TATATATATATATA T . 

ATTTTGCTTGTGGCTCCACACACACXAAAAAAG ACCTGTTAAAATT 

A-G C AAAA T 

ATACCTGTTCCTTAATTACAATATTTCTC^ 



GGAAAATA - AAAAAAGAAAAAAAAGAAAAAAAAACGACAAATCTGTCTGC 
A-TTT A A G AA C 



CGCTTCTTT 



TGGTCACTTCTTCTGTCCAAGCAGAT 
AA CT- 

CAAGGGCTTTCCTCTGCGAGGTGAAGGAGG CTCCAGGCAGCACCCAGG TT 
A-A C T-A A T 

TTGCACTCTTCrrTTCTCCCGTGCTTGTGA 

TG— . TC — . G— C A 

TGCAG GAGCGCTCCCTT 

-A GACAGTTCATTTCAGCATGGGGTCAGGAGACAA A 



-C — A-A T G--A-T A C- 



TCT GGGAGCTGGAGTCCAC 

AGGGATG A-CA-G — TG A — CA T- . . . - 



CCCTTCCCGTGACCTGGTCAGGGTGAGCCCATGTGGAGTCAGCCTCGCAG 
A T- C AA TG AC A — T — GA — A 

GCCT . . CCCTGCCAGTAGGG . TCCGAGTGTGTTTCATCCTTCC . CACTCT 
— T-TC A-TT — C C — A CA-T-AAA-A — C AA T- 

GTCGAGCCTGCGGGCTGGAGCG^GACGGGAGGCCT^ . 
-CT T — A A — A A-A TA-A — ATT T — A CA-G 

ACCTGTGAGCTGCACCAGGTAGAACGCCAGGGACCCCAGAATCAT^ 
GTG A T T C- -CA-C - A 

TCAGTCCAAGGGGTCCCCTCCAG . GAGTAGTGAAGACTCCAG AAATGTCC 
A C T T AA CA 

CTT. . TCTTCTCCCCCATCCTACGAGTAATTGCATTTGCTTTTGTAATTC 
TC TGC C-T C C- 

TTAATGAGCAATATCTGCT AG AGAGTTTAGCTGT AACAG TTCTTT 

T AAAAA-A-A-AA 

TTG ATCATCTTTTTTTAATAATTAGAAACACC AAAA 

CAAA-GG A — C — A A CCCCCCCCAA 



AAATCCAGAAAC 



CCAAAGCAGAGAGCATTATAATCACCAGGG 



CCAAAAGCT . TCCCTCCCTGCT GTCATTGCTTCTTCT 

T — G A-A-CT — ACCCCATCTCCTCA-G G 

GAGGCCTGAATCCAAAAGAAAAACAGCCATAGGCCCTTTCAGTG 

A A .-C-GC G-T-TTTG GG A-T 

CTA(XKX5TGAGCCCTTCGGAGGACCAGGGCTGG 

T— GAG-TC-T A — . .T A A-A — C 

CATCC . . GGGGCCAGCTCCee*JUlXjlXjI*rCAGra 

T TA C-TT TA-G-A — AA A T — A-T CA 

ATGCTCTTTCCC A CCCAGCCTGGGATAGGGGCAGAGGAGGCGAGGAGGCC 
-CAT-A— C G — G — A — A G A— A— C-AAG — TT 

GTTGCCGCTG ATGTTTGGCCGTPGAACAGGTGGGTGTCTGCGTGCGT 

— CT— A CTACT-AC ACTG A A-CAT — AT— 

CCAa/lX^^l^TTTTCTGACriXI^CATGAA^ 

TA A A — G TG G AG-A A : --T— 

ACCCGGTGACCTCTAGCCCTGC*CCGGATGGAGCGGGGCC 



3CTGGACAGTGGJW3TGCAAAAGGCTTGCAGAACTTGA 
^CTTGCTACCAOSGCCTCC . TTTCCGTTTGATTTGTC 



A- 
AGC 



ACTGCTTCAATCAATAACAGCCGCTCCAGAGTCAGTAGTCAAT^ 

TGACCAAATATCACCAGGACTGTTACTCAATGTGTC - 

C — - T — T 

CATGCTGGGCTCCC . GTGTAT^TGGACACTGTAACtrTGTGCTGTGTTTGC 
TG T C T 

TCCCCTTCCCCTTCCTTCTTTGCCCTTTACT^ 
TGTTTGGGTTTGGTTTGGTTTTTATTTC^^ 

GGTTCTCTCTACTGGTCCTC . TTAACTGTGGTGTTGAGGCTTATATTTGT 

GTAATITITGG TGGGTGAAAGGAATTTTGCTAAGTAAA1X.' , 1'C'1'1 1 TGTGT 

TTGAACTGAAGTCTCTATTGTAACTATGTTTAAAG^ 

CAAATATTrCTAGACAC''l > 'l'l''l"l > C TTT ACAAACAAAAGCATTCGGAGGGAG 
GC GT— A A T-A 



Fig. 3 (2) 



GGGGATGGTGACTGAGATGAGAGGCGAGAGCTGAACAGATGACCCCTGCC 
— AAG A-A CA CA-T . . .T T 

CAGATCAGCCAGAAGCCACCCAAAGCAGTGGAGCCC^ 
T A — - . . TA A-T — TT-T T 

AAGCCAGCAAGCCGAATAGCTGATGTGTTGCCACTTTCCAAGTCACTGCA 
T — AG-GA-T — . . T T A AA 



GCCCAGTGGATTCTTGTTTTGCTTCCCCTCCCC 

CCGAGATTATTACCACCATCCCGTGCTTTTAAGCAAAGGCAAGATTGATG 
. . . T — G-A- 



TTTCCTTGAGGGGAGCCAGGAGGGGATCTGTGTGTGCAGAGCTGAAGAGC 
. . -AA-CT---A-T A-A- . -A — A-TA-C- C 

TGGG GAGAATGG . . . GGCTGGGCCCACCCAAGCAGGAGGCTGGG 

— T-CTCACT T AAA T — T-TGAGTTTT A — AC 

ACGCTCT . GCTGTGGGCACAGGTCAG . . GCTAATGT TGGC 

C -A- -G-G-ACA G-G-A-A AA- A AT-AGCCGCTCCC — C- 

AGATGCAGCTCTTCCTGGA . CAGGCCAGGTGGTGGGCATT . CTCTCTCCA 
TA-GAT-C "-AA-A — TA — T-A CCA A AT-G A— 

AGGTCTGCCCCTGTrcjGGCATTACTGTTTAAGACA 

CA— TTT-AAA-A G — CAG-C-G T T-CT T 

CCCATCCTCCAGGGCTCAACAC - . . TGTGACATCTCTATTCCCCACCCTC 
GTTGC — C-T — TA-A — GT — TAA-C T G - 

CCCTTCCCAGGGCAATAAAATGACXAra 

G— A-G — T GG TAGCA ACTC T- CA 

CTGTCACXrCGATCGCCAGCAAAACTTAGATGTGAGAAAACC^ 

A G-T-TA — TC A TC-G-GCC T-C-A . . .GT- 

TCCATGGCGAAAACATCTCCTTAGAAAAGCCATTACCCTCATTAGGCATC 
— T A— C- C--T .TG TT GCAG--T 

GTTTTGGGCT CCCAAAACACCTGACAGCCCCTCCCTCCTCTG 

CCA-C--AATGTAAGAGG--C-G-G-A-TGTT T-GGAG . . 

AGAGGCGGAGAGTGCTGACTGTAGTGACCA . TTG CATGCCGGGTGCAGCA 
. . -T-T C T — AC G GC-ATA-TAGTT-TT- 

TCTGGAAGAGCTAGGCAGGGTGTCTGCCCCCTCCT^ 

G C-A-C--A •■ ACA AA-A-CC-T- -TG-A- 

TCCCCTGTGCCAGCCCAG AGGCCGAGAGCTATGGACAGCATT - . . GCCAG 
C . G — GG-T A-C — T-G-AT-G-CCA 

TAACACAGGCCACCCTX7TGCAGAAGGGAGCTGGCTCCAGCCTGGAAACCT 
. G A T TT — T-A TCAA TC 

GTCTGAGGTTGGGAGAGGTGCACTTGGGGCACAGGGAGAG . GCCGGGACA 
A A CA- . GACAA G — A — A — AT — A 

CACTTA GCTGCAGATGTCTCTAAAAGCCCTGTATCGTATTCACCT 

G--C — GCTGG GG TG C-G - 

GTTTTGGGACAATTACTTTAGAAAATAAGTAGGTCGTTT 

TAAAAACAAAAATTATTGA TT U C 'I T TTTT G TAGTGTTCAGAA - AAAAGGT 
__. . A--C 

TCTTTGTGTATAGCCAAATCACTGAAAGCACTGATATATTTAAAAACAAA 



AGGCAATTTATTAAGGAAATTTGTACCATTTCAGTAAACCTGTCT 



TACCTGTATAC<3TTTCAAAAACACCCCCCCCCCACTGAATCCCTGTAACC 
-A C 



TATTTATTATATAAAGAGTTTGCCTTATAAATTTA 



Fig. 3 (3) 
sequence-conserved high-energy sequence 

Row labels, repeated for each alignment block: human, schim, orang, makak, hamst, mouse, rat, kanga 



TTGCTGCAGATACTACTGACCAGACAAGCTGTTGACCAGGC^ . 



. CCCCCATGTGGTCGT 



— T 



■ — . . C. . -AATA. . . -TC A C— A- 

- — TC CAA-AATATCCT-CC-CTTCCCCCCCCCCACCCCCG .G C 

- — TC ACAA-AA-A - CC - - C - CCCTCCTCACC CCACCCCTAT TG C— A- 

- TTT - TAGGGTA - A — AGC GC — T TCATC-C- 



101 



-T . , 

. .G— . . 
.-G— . . 



A T T C-T-. . 

-A A A T TGA T C-T-A'! 

-A A T CTGA T C-T- . . 

G AGGGA-GGT — AC -G-T — T - -CT — A- A — TAA- AG AGT AG -G - TAGTGG - AG - TTA - ATTTT - AGTG — 



. GCTCCCCCAGCTCCCCCC . 


- ACCTCCCCC 






\ T TGA — TTTA- . . 

. T TGA — TTTA- . . 


• A A 

• A A 



201 

ACTCCCAACCACGTT 



-T-TGA. . 
-T-TGA. . 
-T-TGA. . 



GGGACAGGGAGGTGTGAGGCAGGAGAGACAGTT 



A T 



T 

C 



— A-A-T-T-G 

A— TA-G 

A — TA-ATA 



-T-AA — TT-TA — CCAA-GTCTTA-AT-A-T-T-TT-AG-G-TTTT — . 



GGATTCTTTAGAGAAGA. . . TGGATATGACCAGTGGCTATGGCCTGTGC 
T 

C . . . C 

G C 

. CTA. - — GG A — G CA.-T 

C GG CTA GT — G AT — A CA\ CT 

-C GG CTA -A — GT— G-A AT-GC-C — CA. -T 

. CCCT GG-GCC.G — GGG--GGGG-A — G- . — ATT A- . . . 



301 

GATCCCACCCGTGGTGGCTGAAGTCTG^CCCCACACCAGCCCCAATCCAAAACTGGCAAGG^ 











A— TAG-A-T —A T-CA . 




-TAT — TGA-AA CA-T A 

-TAT — TG — AA-A-A A 



AT TTAGGAAA-A — G-TG — A-A-A-AG — G-G--CTGAGC-GTTGGC A. . GA-C-TGACTAGGG-CC-G T- .A AA 

401 

AGCTCTGGCATGGCTAGGAGGCGGGAGTCCCTTGAACTACTGGGT . GTAGACTGGCCTGAACCACAGGAGAGGATGGCCC\GGGTGAGGTGGCATGGTCC 



T--A-G T-A-T. . 

--A AG-T-A-T. . , 

AGAT-A-T . . , 

.CAAGG CCA— T-A-T. . , 



— . G A-TAC T G A T A A — A- A- -AT CAC-T 

— . -A GA C-TA— A A G A T A A T CA— T 

-T.-A GT C-TA — A G-A A A--A-A T CA— T 

. A— AGGG-GGG — AAGAC-T-A-A- AAGGA-TAG . . AA-C A— T CC-A-A-AA-AGC T . 



501 

ATTCTCAAGGGACG . TCCTCCAACGGGTGGCGCTAGA * GGCCATGGAGGCAGTAGGACAAGGTGCAGG CAGGCTCGCCTGGGGTCAGGC CGGGCAG 

. C C 

. AA. . . C C 

m GTT A GA. . . CA CA C- 

GC--A C-T-C-T-A-T-GA-AA-AATT A-GAGG . T C CA — A T GTG A A-A-CT 

G A C-T-C C-T-GAA-A-AAT A-GAGG. T C A — A A-GTG-A-. A A CT 

G A C-T-C C-T-GAA-T-CAT A-GAAG . T C A — A -A-GTG-A- . A A CT 

— AACC-TAC — A GGA-T--A-TTG-A-GAGGCCC-T A-TCCCC-ACCACCAA-A AT-T — A-C-GCA- -T . . — TT 

601 

AGCACAGCGGGGTGAGAGGGATTCCTAATCACICAGA TAGTGGACAGGGGAGGGGGCAAAGGGGGAGGAGAAG 



G 

-A CGT- . . - 

— TG-CA-A-AACA A-CAAT — G . TG 

— TG-CA-A-AACA A-CAGT-- . . .G— 

— TG-CA-A-A-CA A-CAGT — C.TG . . 



- -GGTAGTTAGGGACTC - 
-CGGTAGTTAGGGACT- 



. -G-G- 



. — T TA-G 

.— T-A-TAAGA. . . 
-T TAAG- 



-T TG— AC -A .GC 



-T A--A-T- 

A--A-G- 

A--A-GA 



CATTT--T — ACCTT - T - TATA - - TGGGTGTG - ATGCAC -TAGATA A-TGA — A-GA 



701 

AAAATGTTCTTCXAGTTACTTTCCAATTCT . . . CCTTT AGGGACAG CTT AGAATT ATTTGCACT ATTGAGTCTTCAT 



ACT 

A A G ... 

A A G ... 

A A— G G ... 

C CT G-G— ACT AGC 



GTTCCCACTTCAAAACAAA 



801 

CAGATGC 



G C-G-C A T — GA TCCCA 

TCTGAGAGCAAACTGGCTTGAATTGGTGACA TGACAGTGTTGAGAACTACCTGGATTT 



-A- 
-A- 



-A- 
-A- 
-A- 



-C- 
-C- 
-C- 



-A- 
-A- 
-A- 



-CA 
-CAA 



--G- 
— G- 



C— 



A- 

C- 

T— C- 

T — A- 



-T- -CTGAGATG-TCA-C- -A- - 
901 

GTATATATACCTG . 



C G A GCCC - TG -CACTTA-TTA CACTGGTG TG-G TT C AT 

Fig. 5 



-CCTG. , 



CCTG. 

GG CTG- 
Partial sequence of the non-coding RNA gene from hamster 



1 TTGCTGCAGA TACTACTGAC CAGACAAGCT GTTGACCAGG CACCCCCCCA 
51 ATACTCCCCC AATGTGCTCA TTAGAGATAG CAGTTGAGAG GACACTCCCA 
101 TTTTTGGTGC CCTGTCCATA GCTTCCCTGA CTCTTCCACC ACCCCAACTC 
151 CCAATCTGAG GGACCGGGAG GTGCGAGGCA GGAAAAATAT TGGATTCTTT 
201 AGAGAAGACT AGAGGTGACC AGTGACTGTG GCCCAGTAAT TAGAACTGTG 
251 GTGGCACAAG TCTGGCCCCA CATCCACCCA ATCCAAAACT GATAAGGATA 
301 TTTTGAAAAA CAGGAAAGCA GTACCTGTCT GATCCAGCTC TGGTATAGGT 
351 AGGAGTGAGT CCTGAACTGC TGGATTACAG ACTGGCTTGA GCCACAGAAG 
401 ATGATGGACC AGAGTAAAGT ATCATCACCT GCTCACAAGG CATGCTTCAC 
451 TAGAGAATAA TTCTAAAGAG GTGCCATGGA GGCAGCAGGA CAAGGCACAA 
501 GCAGTCTGGG TGGGGGTCAA GCCAGACCTA GTGCCACAGA ACAAGAGAGC 
551 AATCTGTGAC TAGTAGTTAG GGACTTTGTG GATGGGACAA GGGGCATGGG 
601 GGAAGAAATG AAAATATTCT TCCAATTACT TTCCAGTTCT CCTTTAGGGA 
651 CAGCTTAGAA TTATTTGCAC TATTGAGTCT TCATGTTCCC ACTTAAAAAC 
701 AAACAGATGC TCTGAAAGCA AACTGGCTTG AAATGGTGAC ACTTTGTCCC 
751 ACAAGCCACC AAATGTGGCA GTGTTTAGAA CTACCTGGAT CTGTATATAC 
801 CTG 











Fig. 5a 
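
The partial sequences in Figs. 5a to 5d are printed as 10-base blocks with a position label at the start of every 50 bases. The following is a minimal sketch, not part of the original disclosure, of how such a listing could be reassembled into one continuous string while checking each position label against the number of bases read so far. The function name join_blocks and the accepted alphabet (A, C, G, T, N) are assumptions made for illustration; the sample input is the start of the hamster listing.

```python
import re


def join_blocks(listing: str) -> str:
    """Join a block-formatted sequence listing into one continuous string.

    Numeric tokens are treated as position labels and must equal the
    1-based position of the next base; all other tokens are sequence blocks.
    """
    blocks = []
    count = 0  # number of bases read so far
    for token in listing.split():
        if token.isdigit():
            # A label such as "51" should name the position of the next base.
            if int(token) != count + 1:
                raise ValueError(
                    f"label {token} does not match position {count + 1}"
                )
            continue
        token = token.upper()
        # Hypothetical alphabet check; widen it if ambiguity codes are needed.
        if not re.fullmatch(r"[ACGTN]+", token):
            raise ValueError(f"unexpected characters in block {token!r}")
        blocks.append(token)
        count += len(token)
    return "".join(blocks)


# Start of the hamster listing (Fig. 5a), used only as sample input.
sample = """
1 TTGCTGCAGA TACTACTGAC CAGACAAGCT GTTGACCAGG CACCCCCCCA
51 ATACTCCCCC AATGTGCTCA
"""
hamster_start = join_blocks(sample)
print(len(hamster_start))
```

Because the function splits on any whitespace, it accepts the blocks whether they are printed five to a row or one per line, and a block broken by stray spaces is rejoined automatically.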
Partial sequence of the non-coding RNA gene from kangaroo 



1 TTGCTGCATA TACTACTGAC CAGACAAGCT GTTTATCAGG CTTTTTAGGG 
51 TACACCAGCA CCTGCCCTCC ATTCATCCCT GTTGGGAGAG GGATGGTGTA 
101 CTGGTTGTCA CTAGAGACCT AACAGAGTAG GGTTAGTGGG AGCTTACATT 
151 TTCAGTGCCA TTAACATTCT AGTCCAAGGT CTTAAATTAT TATGTTGAGG 
201 GGTTTTTTTT CCCCTGAGGG GGCCGGGGGG TGGGGGGAGG GTTGATTAGA 
251 TTCCTTAGGA AAGAGGGTTG AGACAGACAG CAGAGCACTG AGCAGTTGGC 
301 ACTAAAGGAG ACCTTGACTA GGGGCCAGGT GGCATCATCT AATCCCAAGG 
351 GGCTCCAAGT GAGTATTAGG GTGGGGGAAG ACATTATAGA AGGAATAGAA 
401 ACAGGATAGC TCAGCCTAAA GAAGAGCGGT TAAAACCCTA CCCACCAGGA 
451 GTTGACTTGA AAGAGGCCCC TATGGAGGAA TCCCCAACCA CCAAAAGCAA 
501 TCTTGAGCTG CAGCTGCTTC ATTTAGTGGA CCTTGTGTAT ATCTGGGTGT 
551 GTATGCACAT AGATAGACAG TGAGAAAGAA AACTGTTCTT CCAGTTCTTT 
601 TCCAGTGCTA CTAGCTTAGG GACAGGTTAG AACTGTCTGC ACAATTGTGT 
651 GATCATTCCC ATTCCCACTT CAAAACAAAC TGACTGAGAT GTTCAACAGA 
701 AAACTGGCTT CAATGGGTAA CATGCCCTTG CCACTTACTT AAGACACTGG 
751 TGTGATGGGG TTTTGAACTC CCTATATTTG TAGGTATCTG 





Fig. 5b 
Partial sequence of the non-coding RNA gene from makaka 



1 TTGCTGCAGA TACTACTGAC CAGACAAGCT GTTGACCAGG CACCTCCCCT 
51 CCCGCCCAAA CCTTTCCCCC ATGTGGTCGT TAGAGACAGA GCAGTTGAGA 
101 GGACACTCCC GTTTTCGGTG CCATCAGTGC CCCGTCTACC ACTCCCCCAG 
151 CTCCCCCCAC CTCCCCCACT CCCAACCACG TTGGGACAGG GAGGTGTGAG 
201 GCAGGAGAGA CAGTTGGATT CTTTAGAGAT GGATGTGACC AGTGGCTATG 
251 GCCCGTGCGA TCCCACCCGT GGCGGCTCAA ATCTGGCCCC ACCCCAGCCC 
301 CAATCCAAAA CTGGCAAGGA CGCTTCACAG GACAGGAAAG TGGCACCTGT 
351 CTGTTCCGGC ATGGCTAGGA GGGAGTTGTC CCTTGAACTA CTGGGTGTAG 
401 ACTGGCCTAA ATCACAGGAG AGGATGGCCC AGGGTGAGGT GGCATGGTCC 
451 ATTCTCAAGG GACGTCCTCC AGTTGGTGGC ACTAGAGAGG CCATGGAGGC 
501 AGTAGGACAA GGCACAGGCA GGCTGGCCCA GGGTCAGGCC GGGCCGAACA 
551 CAGCGGGGTG AGAGGGATTC CTCGTCTCAG AGCAGTCTGT GACCGGTAGT 
601 TAGGGACTTA GTGGACAGGG AAGGGGCAAA GGGGGAGGAG AAGAAAATGT 
651 TCTTCCAGTT ACTTTCCAAT TCTACTCCTT TAGGGACAGC TTAGAATTAT 
701 TTGCACTATT GAGTCTTCAT GTTCCCACTT CAAAACAAAC AGATGCTCTG 
751 AGAGCAAACT GGCTTGAATT GGTGACGTTT AGTCCCTCAG GCCACCAGAT 
801 GTGATGGTGT TGAGAACTAC CTGGATATGT ATATATACCT G 



Fig. 5c 
Partial sequence of the non-coding RNA gene from orangutan 



1 TTGCTGCAGA TACTACTGAC CAGACAAGCT GTTGACCAGG CACCTCCCCT 
51 CCCGCCCAAA CCTTTCCCCC ATGTGGTCGT TAGAGACAGA GCAGTTGAGA 


A. \J A. 




VJ X X X X V_ X V? 


CPATPAGTGP 

v— V—rt X v-rtV3 X wv. 


PPPGTPTGPA 

V-V— V- VJ X V_ X ULA 


GP'TPPPPP a n 

VJV. X UU\»UV„ rtw 


J > X. 




PTPPPPPAPT 


CPPAAPPACG 


TTGGGAPAGG 


GAGGTGTGAG 


?oi 

x> \J J. 


gpaggagaga 


X X VJVJrt X X 


PTTTPRARAA 

V — XXX V_. OrivJ/TA 


GATGGATATG 

\J£\ X VJVJrt. X X VJ 


APPAGTGGPP 




A TPn PPTP TCI 

rt X VJlVjJV, V_ X V7 X\J 


PG A TP PP a pp 


» — Vjj X V_yVJ"V__VJvJXV_, X 


PAAGTPTGGP 


PPPAPAPPAG 


01 
Jul 


l_ v-AA luLA 


A A APTPP-P A A 


fifj TV PPPTTP A 


PAGGAPAGGA 


A AGTGGP APP 


j ji 


TPTYJPTP p 


AfiPTPTfifiPA 


TGGPT A GGA C 

X \J\JV_- X rt VJVJrtO 


GGAGTPGTPP 


PTTGAAPTAP 


4fc U J. 


1 \ jv?vj 1\3 X AvjA 


I— 1\AjLL lorul 


L \_ A\— AvjvjAVj a 


nfiATfifJPPPA 

VjVjrt iuuLL V__rt. 


GGGTGAGGTG 




IjvJA x\j<j IV LA 


X X A_ xa_AAvjtVjL-* 


al v? ill ±\_v_a 


nLuuu XLjVjV_Vj 


PTAPA A A PPT" 
V_ X nun/lnUuL 




CATGGAWA.A 


tj 1 AIjvjAv- AAvj 


\jv_\j\_Avj\jv-Avj 


VjC AX^v-V-V-vovj 


VjVjj l\_AvjrfjLLL3 


551 GGCAGGGCAC AGCGGGGTGA GAGGGATTCC TAATCACTCA GAGCAGTGTG 
601 TGACTGGTAG TTAGGGACTC AGTGGACAGG GGAGGGGCGA GGGGGCAGGA 
651 GAAGAAAATG TTCTTCCAGT TACTTTCCAA TTCTCCTTTA GGGACAGCTT 
701 AGAATTATTT GCACTATTGA GTCTTCATGT TCCCACTTCA AAACAAACGA 
751 TGCTCTGAGA GCAAACTGGC TTGAATTGGT GACATTTAGT CCCTCAAGCC 
801 ACCAGATGTG AGTGTTGAGA ACTACCTGGA TTTGTATATA TACCTG 




Fig. 5d 
Partial sequence of the non-coding RNA gene from rat 





i 


J. x vj v_ x vjv— fivsn 


TACTAPTfiAP 

X AV» X rtV— X UnU 


PAGAPAAGPT 


GTTPAPPAPG 

VJ X X VJrtV— V_tt. VJVJ 


CACTPPPPAP 
v_.n.v_ x v.v<v<v<nv. 




~J X 


ri-tt. v_ x^n.v_ \_ 


V_ V_ V_ X v_ v_ v_ x v_ V_ 


TPAPPPPAPP 

X V_.tt.V_V_ V_V_zn.V-.V_ 


PPTATPPPPT 

V_ V_ X t\ X V_ V_V_V_ X 


RTHTfiPTPAT 

Vj X Vj X VjV_ X V_tt. X 




X \J X 




A ATTPAPAPP 


AP A PTPPP AT 

riVr/lL. X V_V_V_tt. X 


XXX X VjVj 1 VjV_\_ 


APTGATPPPP 
AV_ X VJ.tt. X Vj v_ v_ v_ 




i 

X J X 


TGTPPATAPp 


TTPPPTRAPT 

X X V_.V_.V_ X VJtt>V X 


TTTAPAPPAP 


PPPAAPTPPP 
V<wV>/UiV> X v_ v_ V 


AATPTCARfiR 

X V_ X VJ/AVjVJvJ 






APTnnnAnnT 

^i.V_ X VjrVj\J.rlVJ\J X 


rty; a pn p a r^n 

Vj X VJrlU Vj V — .tt Vj\J 


ARAAAPTATA 

A Vj.rvrVri.V_ X A X A 


«t»7v nri A PTPTT 

X tt.Vj\J.tt\ X V_ X X 


PPP AGA APA P 
Vj Vj vjxA.vj/^-rt.vjtt. v. 




^ J X 


i. X ,tt.VJ AVj J. XV? 


Vj\_ AAVj X Vj.tt. X X 


dCCZCHCC AP,T 

Vj V_ VJ V_ V_ V_ V_.tt.Vj x 


AATTPPAAPT 

AA X X V_V_tt-ttV_ X 


RTfiGTAGPAP 

VJ X VJVJ X tt.VJV_tt.V_ 




Jul 


A A OTPTPPPT 


PP APAPPA AP 
V- v_.tt.v_.fatw- v_tt-tt.v_ 


PPAATPPA A A 


APTV2 AP A AP»P» 
nv. Xvj.tt.v_.tt-tt.vjvj 


AP ATTTTGPA 

nLnX X X X Vj\_tt. 






& & & & ATf- A A A 
ArAA-ft-ri 1 vj AAA 


VJ X \J\JV_A X X X Aj 


TPTVIATPPAP. 

X A_ XVjA XV_V_.tt.V_t 


PTPTViPP A TP 
V_ X V_ X\jVjv_ A X Vj 


PPT AG A PATH 
VjV_ x x^vjtt.vjtt x vj 






Avj IX- 1 1 AAAL 


1X3 1 lobL 1 1 A 


1 AAAv- IajvjL, V_ 


x\jAVj v_ riAv, Avj 


A A C A ClCl A r m/~' 
AAvj Avj Vj A 1 VjVj 






CCCAGACj i AA 


Avj 1 vj IV A 1 LA 


1 v_ 1 vj I I v_Av_A 


AvjVjv_A 1 VjV_ 1 v_ 


f~'Or" - ""p A p» a apm 
v_v_ v_ I AvjAAvj 1 






TCATGCTAAA 


GAAGxuLLA 1 


VjVj AvarvjC AvjC A 


7\ A A ?\ P»*T*7\ 

vjvj AL> AAAVj 1 A 


a PPPTA ppm 
v_AvjVjv_ I A vjvj 1 




551 GGAGTCAAGC CAGGCCTAGT GCCACAGAGC AAGAGAGCAG TCTCTGACTA 
601 GTAGTTAAGG GGGAAGAAAG AAAAATATTC TTCCAATTGC TTTCCAGTTC 
651 TCCTTTAGGG ACAGCTTAGA ATTATTTGCA CTATTGAGTC TTCATGTTCC 
701 CACTTCAAAA CAAATAGATG CTCTGAAAGC AAACTGGCTT GAAATGGTGA 
751 CACTGTCCCA CAAGCCACCA GACAATGGCA GTGTTCAGAA CTACCTGTAT 
801 ATGTATATAC CTG 
Partial sequence of the non-coding RNA gene from chimpanzee 



1 TTGCTGCAGA TACTACTGAC CAGACAAGCT GTTGACCAGG CACCTCCCCT 
51 CCCGCCCAAA CCTTTCCCCC ATGTGGTCGT TAGAGACAGA GCGACAGAGC 
101 AGTTGAGAGG ACACTCCCGT TTTCGGTGCC ATCAGTGCCC CGTCTACAGC 
151 TCCCCCAGCT CCCCCCACCT CCCCCACTCC CAACCACGTT GGGACAGGGA 
201 GGTGTGAGGC AGGAGAGACA GTTGGATTCT TTAGAGAAGA TGGATATGAC 
251 CAGTGGCTAT GGCCTGTGTG ATCCCACCCG TGGTGGCTCA AGTCTGGCCC 
301 CACACCAGCC CCAATCCAAA ACTGGCAAGG ACGCTTCACA GGACAGGAAA 


351 




TCTGCTCCAG 


CTCTGGCATG 


GCTAGGAGGG 


GGGAGTCCCT 




TPAAPTAPTG 


GGTGTAGACT 


GGPCTGAACC 


AC AGP AG AGP 


ATGGCCCAGG 






PTPPTPPATT 


PTP A APPPAP 


PTPPTPPA AP 


X V3\JTV^\JV_ X 


DUX 




p a ppp a pt a p 


PAP A JLCZtZHCZt"* 


APPPAPPP TYZ 


PP P P PPPP'TP 


551 AGGCCGGGCA GAGCACAGCG GGGTGAGAGG GATTCCTAAT CACTCAGAGC 
601 AGTCTGTGAC TTAGTGGACA GGGGAGGGGG CAAAGGGGGA GGAGAAGAAA 
651 ATGTTCTTCC AGTTACTTTC CAATTCTCCT TTAGGGACAG CTTAGAATTA 
701 TTTGCACTAT TGAGTCTTCA TGTTCCCACT TCAAAACAAA CAGATGCTCT 
751 GAGAGCAAAC TGGCTTGAAT TGGTGACATT TAGTCCCTCA AGCCACCAGA 
801 TGTGACAGTG TTGAGAACTA CCTGGATTTG TATATATACC TG 